ABSTRACT


One of the most powerful laws in physics is the second law of thermodynamics, which states that the entropy of any system remains con- 
stant or increases over time. In fact, the second law is applicable to the evolution of the entire universe and Clausius stated, “The entropy 
of the universe tends to a maximum.” Here, we examine the time evolution of information systems, defined as physical systems con- 
taining information states within Shannon’s information theory framework. Our observations allow the introduction of the second law 
of information dynamics (infodynamics). Using two different information systems, digital data storage and a biological RNA genome, 
we demonstrate that the second law of infodynamics requires the information entropy to remain constant or to decrease over time. This 
is exactly the opposite to the evolution of the physical entropy, as dictated by the second law of thermodynamics. The surprising result 
obtained here has massive implications for future developments in genomic research, evolutionary biology, computing, big data, physics, and 
cosmology. 


© 2022 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license
(http://creativecommons.org/licenses/by/4.0/). https://doi.org/10.1063/5.0100358


I. INTRODUCTION


The research field of information dynamics (infodynamics) has
its origins in a few significant scientific developments that include
the seminal information theory developed by Shannon in 1948¹
and the pioneering work on information physics by Brillouin in 1953²
and Landauer in 1961.³ A more recent development is the
introduction of the mass-energy-information (M-E-I) equivalence
principle formulated by Vopson in 2019.⁴ Using thermodynamic
considerations, Landauer introduced his 1961 principle stating that
information, as defined in Shannon’s framework, is not just a mathematical
construct but is physical, having a small energy associated
with it that is detectable when the information is erased. Backed by
multiple experimental confirmations reported in the literature,⁵⁻⁸
Landauer’s principle has long since left the purely theoretical realm, and the
scientific community today broadly accepts it as valid. The M-E-I
equivalence principle proposed in 2019 is an extension of Landauer’s
principle stating that, if information is equivalent to energy, according
to Landauer, and if energy is equivalent to mass, according to
Einstein’s special relativity, then the triad of mass, energy, and information
must all be equivalent, too (i.e., if M = E and E = I, then
M = E = I). The M-E-I equivalence principle has generated a
number of interesting ramifications in physics⁹⁻¹¹ but still awaits
experimental confirmation.¹² The Landauer and M-E-I equivalence
principles are necessary in order to fulfill the thermodynamic laws of
physics. These principles were initially proposed in the context
of digital information and computing technologies, because
any computational or memory device is essentially a physical system
that is part of the universe and must obey the universal laws
of physics, including thermodynamics. Due to these considerations,
Landauer suggested that logical irreversibility must be equivalent to
physical irreversibility. Because irreversible processes are also dissipative,
i.e., they take place with dissipation of energy, and since the
erase operation that deletes a bit of information is irreversible,
it must dissipate a small energy that comes from the memory bit
itself. Hence, Landauer deduced that a bit of information is physical,
or more generally, that any form of information as defined in Shannon’s
framework is physical. The M-E-I equivalence principle proposes
that the Landauer energy of an information bit condenses into its
equivalent mass-energy when the information is stored at equilibrium.
These fundamental ideas have created a bridge between pure
mathematics and physics, essentially “physicalizing” mathematics.
The concept of physicalizing mathematics has profound implications
for the way that we think about the whole universe, because
it shows that the universe is fundamentally mathematical and can
be seen as emerging from information, i.e., “it from bit,” a concept
coined by the legendary physicist Wheeler.¹³

Here, we examine the entropy and the time dynamics of infor- 
mation systems and, in analogy to the second law of thermodynam- 
ics, we formulate the second law of infodynamics. 


II. ENTROPY OF INFORMATION


Let us assume a physical system is in its virgin state with no 
information stored in it [Fig. 1(a)]. We now assume that the sys- 
tem undergoes the process of encoding digital bits of information 
via a given process of digital information storage. The technology 
deployed to encode digital information is irrelevant to our discus- 
sion, but we will demonstrate our argument here using a magnetic 
data storage system. The total entropy of the system is a measure of
all its possible physical microstates compatible with the macrostate,
and we call this the physical entropy of the system, $S_{\text{phys}}$. The physical
entropy of the system is characteristic of the non-information
bearing microstates within the system. We now assume that N digital
bits of information are created within the physical body. This is
equivalent to the “write” operation of a digital data storage device. 
The additional N bits of information created within our test system 
represent N additional microstates superimposed onto the existing 
physical microstates. 

These additional microstates are information bearing states,
and the additional entropy associated with them is called the entropy
of information, $S_{\text{info}}$.

The total entropy of the system is now the sum of the initial
physical entropy and the newly created entropy of information,
$S_{\text{tot}} = S_{\text{phys}} + S_{\text{info}}$. Hence, an important observation is that the process
of creating information increases the overall entropy of a given sys- 
tem. In our example, we write digitally onto our hypothetical system 
the word INFORMATION using magnetic data recording, so a dig- 
ital 0 is blue (magnetization up) and a digital 1 is red (magnetization 
down) [Figs. 1(b) and 1(c)]. 
In binary code, this results in 11 bytes, so N = 88 bits, each in a 0 or 1
state, are encoded [Fig. 1(c)]. The evolution of the physical entropy
and the total entropy of our test system are both governed by the
second law of thermodynamics. The second law of thermodynamics
has many alternative formulations, but in this context, we will use
the one stating that the entropy of an isolated system undergoing
any transformation remains constant or increases over time.
When applied to the whole universe, Clausius’s formulation states, “The
entropy of the universe tends to a maximum.” Mathematically, this
formulation of the second law is written as $\partial S/\partial t \geq 0$, where S is the
total entropy and t is time.

Let us now examine the applicability of the second law to the 
entropy of the information bearing states. To do so, we need to use 
Shannon’s information theory, developed in the 1940s.¹ Shannon gave
the mathematical framework of classical information theory by link- 
ing the occurrence probability of an event to its information content. 
According to Shannon, for an event whose probability of occur- 
ring is p, the information extracted from observing the event is a 
continuous function of its probability: 


$$I(p) = \log_b(1/p), \qquad (1)$$


where I is the information value and its units are given by the choice
of the base, b.

Units of bits are obtained when b = 2, trits when b = 3, and
nats when b = e, i.e., Euler’s number. The choice of
b = 2, resulting in bits, is the one adopted in this article; it is dictated
by current digital technologies, which make it a convenient choice.
For an arbitrary choice of the base b, the information function
can be returned in different units using the logarithm base change
formula, $I(p) = \log_b(1/p) = \log_a(1/p)/\log_a(b)$. For example, if we want to
convert information expressed in nats into bits, then b = 2, a = e,
and $I(p) = \ln(1/p)$ (nats) $= \ln(2) \cdot \log_2(1/p)$ (bits).
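As a minimal sketch of Eq. (1) and of the base change formula, the following Python snippet evaluates the information content of a single event in bits and in nats. The function name `information_content` and the probability p = 0.25 are our own illustrative choices and do not come from the original analysis.

```python
import math

def information_content(p: float, base: float = 2.0) -> float:
    """Information I(p) = log_base(1/p) of an event with probability p, Eq. (1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return math.log(1.0 / p, base)

p = 0.25  # illustrative probability, not a value from the article
bits = information_content(p, base=2)
nats = information_content(p, base=math.e)

# Base change: the value in nats equals ln(2) times the value in bits.
assert math.isclose(nats, math.log(2) * bits)
print(f"I({p}) = {bits:.3f} bits = {nats:.3f} nats")
```

Passing `base=3` would return the same quantity in trits.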

When we observe a set of n independent and distinctive events
$X = \{x_1, x_2, \ldots, x_n\}$ having a discrete probability distribution
$P = \{p_1, p_2, \ldots, p_n\}$ on X, the average bit information content per
event that can be extracted when observing the set of events X
once is

$$H(X) = \sum_{j=1}^{n} p_j \log_2 \frac{1}{p_j}. \qquad (2)$$

FIG. 1. (a) Schematics of a material in virgin state with no information stored in it; (b) the word INFORMATION is written on the material in binary code using magnetic
recording; and (c) the grid of 0 and 1 information states created in the process of information recording.


The function H(X) is called the Shannon information entropy,
and it is maximum when the events $x_j$ have equal probabilities
of occurring, $p_j = 1/n$, so that $H(X) = \log_2 n$. The Shannon information
entropy function returns a bit content value per event. When observing
N sets of events X, or N times the set of events X, the number
of bits of information extracted from the observation is N·H(X).
The Shannon information entropy H(X) is closely linked to the
entropy of the information bearing states, $S_{\text{info}}$. If N digital bits are
created within the physical body, then the additional possible states,
also known as distinct messages in Shannon’s original formalism,
are equivalent to the number of information bearing microstates, $\Omega$,
compatible with the macrostate:

$$\Omega = 2^{N \cdot H(X)}. \qquad (3)$$

Using (2) and (3), we can deduce the entropy of the information
bearing states from the Boltzmann relation,

$$S_{\text{info}} = k_b \ln \Omega = N k_b \ln 2 \sum_{j=1}^{n} p_j \log_2 \frac{1}{p_j}, \qquad (4)$$

where $k_b = 1.38064 \times 10^{-23}$ J/K is the Boltzmann constant.

In the context of Shannon’s information theory, our test example
shown in Fig. 1 has N = 88, n = 2, and X = {0, 1}. Counting the
occurrences of 0s and 1s, we get 49 and 39, respectively. This results
in $P = \{p_1, p_2\} = \{49/88, 39/88\}$. If the two events had equal
probabilities, i.e., $P = \{p_1, p_2\} = \{44/88, 44/88\} = \{1/2, 1/2\}$, then
using (2) it can be shown that H(X) = 1, or an average of one bit
of information is encoded per state. For our example, however,
the Shannon entropy function is just under 1 bit,

$$H(X) = \sum_{j=1}^{n} p_j \log_2 \frac{1}{p_j} = \frac{49}{88} \log_2\left(\frac{88}{49}\right) + \frac{39}{88} \log_2\left(\frac{88}{39}\right) \approx 0.99. \qquad (5)$$

Our objective in this study is to examine the time evolution of
$S_{\text{info}}$. According to (4), only two variables, N and H(X), can drive any
changes in $S_{\text{info}}$. The Shannon function has a maximum value,
which is 1 in our case, and it tends to 1 for large N.
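The worked example can be reproduced with a short Python sketch. It assumes that the word INFORMATION is stored as 8-bit ASCII, which is consistent with the 11 bytes and with the counts of 49 zeros and 39 ones quoted in the text; the helper names and the CODATA value used for the Boltzmann constant are our own choices.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (CODATA value, assumed here)

def bit_counts(word: str) -> tuple[int, int]:
    """Return (zeros, ones) for the 8-bit ASCII encoding of word (assumed encoding)."""
    bits = "".join(f"{byte:08b}" for byte in word.encode("ascii"))
    return bits.count("0"), bits.count("1")

def shannon_entropy(probabilities) -> float:
    """Shannon entropy in bits per event, Eq. (2); zero-probability events contribute 0."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

def information_entropy(n_bits: int, h: float) -> float:
    """Entropy of the information bearing states in J/K, Eq. (4)."""
    return n_bits * K_B * math.log(2) * h

zeros, ones = bit_counts("INFORMATION")
n_bits = zeros + ones
h = shannon_entropy([zeros / n_bits, ones / n_bits])
s_info = information_entropy(n_bits, h)
print(zeros, ones, n_bits)
print(f"H(X) = {h:.4f} bits, S_info = {s_info:.3e} J/K")
```

Because H(X) is close to 1 here, the value of S_info is dominated by N, which is the point taken up in the next section.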


III. TIME EVOLUTION OF DIGITAL INFORMATION STATES


Let us assume that $H(X) \to 1$, and we now examine what
the evolution of N is over time. In our example, the N information
microstates are physically determined by magnetization changes in
the material [see Figs. 1(b) and 1(c)]. For any given non-zero Kelvin
temperature T, these magnetic states will undergo magnetic relaxation
processes with a relaxation time ($\tau$) given by the well-known
Arrhenius–Néel equation,

$$\frac{1}{\tau} = \nu_0 \, e^{-\frac{K_a V}{k_b T}}, \qquad (6)$$

where $\nu_0$ is the magnetization attempt frequency to overcome the
energy barrier, approximately $\nu_0 \approx 10^9$ Hz, $K_a$ is the
anisotropy constant of the magnetic material, V is the magnetic
grain volume, and $k_b$ is the Boltzmann constant.
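To give a feel for the magnitudes involved, the sketch below evaluates Eq. (6) in Python for the simulation parameters quoted later in this section, read here as K_a = 8.75 × 10^5 J/m^3, V = 10^-25 m^3, T = 300 K, and an attempt frequency of 10^9 Hz. The function names are hypothetical, and the script is only a numerical check of the formula, not part of the micromagnetic simulation.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def relaxation_time(k_a, volume, temperature, nu_0=1e9):
    """Relaxation time tau in seconds from Eq. (6): 1/tau = nu_0 * exp(-K_a V / (k_b T))."""
    return math.exp(k_a * volume / (K_B * temperature)) / nu_0

def volume_for_ratio(ratio, k_a, temperature):
    """Grain volume (m^3) at which K_a V / (k_b T) equals the given ratio."""
    return ratio * K_B * temperature / k_a

K_A = 8.75e5  # anisotropy constant, J/m^3 (value as read from the text)
T = 300.0     # temperature, K

tau_sim = relaxation_time(K_A, 1e-25, T)  # reduced cell volume used in the simulation
v_stable = volume_for_ratio(40, K_A, T)   # volume giving an energy ratio of 40
print(f"tau(V = 1e-25 m^3) = {tau_sim:.2f} s")
print(f"V for K_a V/(k_b T) = 40: {v_stable:.2e} m^3")
```

The exponential dependence on V is what makes a modest reduction of the cell volume shorten the relaxation time from years to seconds.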

The meaning of this relaxation time is the average time it 
takes for a magnetic grain of volume V within a magnetic bit 
state to undergo a spontaneous magnetization flip due to the ther- 
mal activation. Hence, after a sufficiently long time, we expect 
magnetic grains to lose their magnetization state, leading to mag- 
netic bit states undergoing self-erasure, and, therefore, reducing the 
information states N. The implication of this analysis is that the 
entropy of the information bearing states tends to decrease over 
time. 

To demonstrate this, we simulated a granular magnetic thin
film structure with perpendicular uniaxial anisotropy of $K_a = 8.75 \times 10^5$ J/m³,
saturation magnetization $M_s = 1710$ kA/m, and average
unit cell size (cubic) $V = 10^{-25}$ m³, at room temperature
T = 300 K. A standard cell size volume suitable for magnetic recording
should be sufficiently large to maintain thermally stable magnetization
of the cell for ~10 years [i.e., $\tau$ in Eq. (6) is $3.15 \times 10^8$ s].
Under this condition, the ratio of magnetocrystalline energy to thermal
energy at T = 300 K should be around 40, $K_a V/(k_b T) \approx 40$, resulting in
a cell size volume of $V \approx 1.9 \times 10^{-25}$ m³. The unit cell size volume
in our simulations has been intentionally taken 1.9 times lower in
order to speed up the computation time. These values resulted in a
relaxation time of 1.5 s, which corresponds to a single iteration in
the Monte Carlo algorithm. The simulated thin film sample size was
400 × 550 × 2 nm³, giving a bit size of 50 × 50 nm². Starting with a
thermalized random state [Fig. 2(a)], the word INFORMATION is suddenly
written using the digital binary code [Figs. 1(b) and 2(b)]. Using a
micromagnetic Monte Carlo algorithm,¹⁴ we tracked the information
loss as the system is allowed to thermalize over a period of time
[Figs. 2(c)-2(h)]. The data indicate that the system evolves over time
in a way that the second law of thermodynamics is indeed fulfilled 
by the physical entropy and the total entropy of the system. How- 
ever, when the entropy of the information bearing states is examined 
independently, we conclude that the second law manifests in reverse 
so that the information entropy stays constant or decreases. This is 
called the second law of information dynamics (infodynamics), and 
it is mathematically written as 


$$\frac{\partial S_{\text{info}}}{\partial t} \leq 0. \qquad (7)$$


This new law of infodynamics must not violate the second law 
of thermodynamics, so the entropy reduction in the information 
states must be compensated by an entropy increase in the physical
states, via a dissipation mechanism. This rationale was behind
Landauer’s principle that information is physical, which was
derived similarly and later expanded to the mass-energy-information
equivalence principle by Vopson.⁴

The simulation performed on our test sample resulted in a 
simultaneous reduction of the magnetization of all the magnetic 
information bit states up to the point when N = 0. However, in 
reality, this process can take place gradually so that N reduces to a 
lower value in steps, until it reaches zero eventually. Re-examining 
relation (4), it can be easily seen that a reduction in N would lead to a
reduction of the information entropy, thereby confirming the second
law of infodynamics (7).

FIG. 2. Time evolution of the digital magnetic recording information states simulated using Micromagnetic Monte Carlo. Over time, the information states gradually vanish
due to self-erasure, reducing the information entropy of the system. Red denotes magnetization pointing out of the plane and blue is magnetization pointing into the plane.
(a) Initial random state; (b) INFORMATION is written (t = 0 s); (c) Iteration 140 (t = 210 s); (d) Iteration 460 (t = 690 s); (e) Iteration 590 (t = 885 s); (f) Iteration 930
(t = 1395 s); (g) Iteration 1100 (t = 1650 s); and (h) Iteration 1990 (t = 2985 s).
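The link between a shrinking N and relation (4) can be illustrated with a deliberately simple toy model in Python. This is not the micromagnetic Monte Carlo simulation used in this work: we merely assume that each bit self-erases independently with a hypothetical lifetime `TAU_BIT`, take H(X) = 1, and evaluate Eq. (4) for the expected number of surviving bits.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def information_entropy(n_bits: float, h: float = 1.0) -> float:
    """Entropy of the information bearing states in J/K, Eq. (4)."""
    return n_bits * K_B * math.log(2) * h

def expected_surviving_bits(n_0: int, t: float, tau_bit: float) -> float:
    """Toy model (our assumption): each bit self-erases independently at rate 1/tau_bit."""
    return n_0 * math.exp(-t / tau_bit)

N_0 = 88          # bits written at t = 0
TAU_BIT = 1000.0  # hypothetical bit lifetime in seconds, chosen for illustration only

times = [0, 500, 1000, 2000, 3000]
entropies = [information_entropy(expected_surviving_bits(N_0, t, TAU_BIT)) for t in times]
for t, s in zip(times, entropies):
    print(f"t = {t:5d} s  S_info = {s:.3e} J/K")

# Eq. (7): the information entropy never increases along this trajectory.
assert all(later <= earlier for earlier, later in zip(entropies, entropies[1:]))
```

Any model in which N can only decrease gives the same qualitative behavior; the exponential form is chosen here purely for convenience.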


IV. TIME EVOLUTION OF BIOLOGICAL INFORMATION STATES


In order to verify the universal validity of the second law 
of infodynamics, we need to examine the time evolution of 
the information entropy of a system, in which the number of 
information states N remains constant and the reduction of the 
information entropy comes from Shannon’s information entropy 
function. 

A natural information coding system that fulfills this require- 
ment is the genetic DNA/RNA code, because the information is 
encoded in the sequence of nucleotides and its time evolution is 
described by the genetic mutations. Genetic mutations are changes 
in the nucleotide sequence, and these changes can take place via 
three mechanisms: (i) Single nucleotide polymorphisms (SNPs), 
where changes occur so that the number of nucleotides remains constant;
(ii) Deletions, where N decreases; and (iii) Insertions, which
result in N increasing. Out of the three possible cases, only the SNP
mutations are of interest to us, because they keep the value of
N constant.

The ideal test system is a virus genome that undergoes frequent 
mutations in a short period of time. In this study, we examined the 
RNA sequence of the novel SARS-CoV-2 virus, which emerged in 
December 2019 resulting in the current COVID-19 pandemic. 

A DNA sequence can be represented as a long string of the
letters A, C, G, and T. These represent the four nucleotides: adenine
(A), cytosine (C), guanine (G), and thymine (T) [replaced
with uracil (U) in RNA sequences]. Therefore, within Shannon’s
information theory framework, a typical genome sequence can be
represented as a 4-state probabilistic system, with n = 4 distinctive
events, X = {A, C, G, T}, and probabilities $P = \{p_A, p_C, p_G, p_T\}$.
Using digital information units and Eq. (2), for n = 4, we determine
that the maximum Shannon information entropy is 2 ($H = \log_2 4 = 2$), so
each nucleotide can encode a maximum of 2 bits: A = 00, C = 01, G = 10,
and T = 11. For a given genomic sequence containing N nucleotides,
the total Shannon information entropy can be at most 2N bits.

TABLE I. Tabulated results of the analysis performed on selected SARS-CoV-2 variants sequenced at various locations
around the globe, over a period of 22 months.

Genome        Reference   SNPs   Time (months)   Location      Shannon IE
MN908947      15          0      0               China         1.9570243
LC542809      19          4      3               Japan         1.9569197
MT956915      20          7      5               Spain         1.9569230
MW466798      21          9      7               South Korea   1.9569327
MW294011      22          19     10              Ecuador       1.9567058
MW679505      23          25     14              USA           1.9566630
MW735975      24          26     14              USA           1.9565714
OK546282.1    25          32     16              USA           1.9565675
OK104651.1    26          40     20              Egypt         1.9564591
OL351371.1    27          49     22              Egypt         1.9562614

The reference RNA sequence of the SARS-CoV-2, represent- 
ing a sample of the virus collected early in the pandemic in Wuhan, 
China in December 2019 (MN908947),¹⁵ has 29 903 nucleotides, so
N = 29 903. For this reference sequence, we computed the Shannon 
information entropy using relation (2). 
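A minimal Python sketch of this calculation is given below. It is not the GENIES software: it simply applies relation (2) to the observed nucleotide frequencies of a string, and relation (4) normalized to the Boltzmann constant. The function names and the 20-letter test string are made up for illustration; a real analysis would read the full sequence from a FASTA file.

```python
import math
from collections import Counter

def sequence_entropy(sequence: str) -> float:
    """Shannon entropy in bits per nucleotide, Eq. (2), from observed frequencies."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def entropy_over_kb(sequence: str) -> float:
    """S_info / k_b = N ln(2) H(X), i.e., Eq. (4) normalized to the Boltzmann constant."""
    return len(sequence) * math.log(2) * sequence_entropy(sequence)

toy = "ACGTTGCAAGCTTAGGCTAA"  # made-up 20-letter string, not a real genome
print(f"H = {sequence_entropy(toy):.4f} bits, S_info/k_b = {entropy_over_kb(toy):.3f}")
```

A sequence with equal nucleotide frequencies returns the maximum of 2 bits per nucleotide, and any bias in composition lowers the value.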

The value obtained represents the reference Shannon infor- 
mation entropy at time zero before any mutations took place. 
Using the National Center for Biotechnology Information (NCBI) 
database, we searched and extracted a number of SARS-CoV-2 vari- 
ants sequenced at various locations around the globe, at different 
times, from January 2020 to October 2021 (Table I).


FIG. 3. Time evolution of the number of genetic mutations (bottom graph) and the
entropy of information bearing states, $S_{\text{info}}$, normalized to $k_b$ (top graph, with linear fit), of selected
sequences of SARS-CoV-2 virus. Covid-19 virus photo by Centers for Disease
Control (CDC) and imported from unsplash.com.
By searching for complete genome sequences containing the
same number of nucleotides as the reference sequence, we carefully
selected variants that displayed an incremental number of SNP
mutations with time, and we computed the Shannon information
entropy for each variant. The calculations were performed
using previously developed software, GENIES,¹⁶,¹⁷ designed to study
genetic mutations using Shannon’s information theory.¹⁸

The full dataset, including genome data references/links, collection
times, number of SNP mutations, and the Shannon information
entropy value of each genome, is shown in Table I. Figure 3 shows
the time evolution of the number of SARS-CoV-2 SNP mutations
and the time evolution of each variant’s information entropy, $S_{\text{info}}$,
computed using (4) and normalized to $k_b$. The data indicate that, as
expected, the number of mutations increases linearly as a function
of time [Fig. 3, bottom graph, coefficient of determination (COD)
= 99%]. Remarkably, for the same dataset, the Shannon information
entropy (see Table I) and the overall information entropy of
the SARS-CoV-2 variants ($S_{\text{info}}$) computed using (4) decrease approximately
linearly over time (Fig. 3, top graph, COD = 97%). The observed
correlation between the information entropy and the time dynamics
of the genetic mutations is striking, because it reconfirms
the second law of infodynamics and also points to a possible
deterministic approach to genetic mutations, currently believed to
be just random events. The existence of an entropic force that governs
genetic mutations instead of randomness is very powerful, and
it could lead to the future development of predictive algorithms for
genetic mutations before they occur.
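The trends described above can be checked with an ordinary least-squares fit in Python, using the values transcribed from Table I and N = 29 903. The helper `linear_fit` is our own, and the coefficients of determination it prints need not match the COD values quoted in the text exactly, because the original fitting procedure is not specified.

```python
import math

# (time in months, SNP count, Shannon IE) transcribed from Table I
TABLE_I = [
    (0, 0, 1.9570243), (3, 4, 1.9569197), (5, 7, 1.9569230),
    (7, 9, 1.9569327), (10, 19, 1.9567058), (14, 25, 1.9566630),
    (14, 26, 1.9565714), (16, 32, 1.9565675), (20, 40, 1.9564591),
    (22, 49, 1.9562614),
]
N = 29903  # nucleotides in the reference sequence

def linear_fit(x, y):
    """Ordinary least squares y = a + b x; returns (slope, intercept, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy ** 2 / (sxx * syy)

months = [row[0] for row in TABLE_I]
snps = [row[1] for row in TABLE_I]
s_over_kb = [N * math.log(2) * row[2] for row in TABLE_I]  # Eq. (4) divided by k_b

snp_slope, _, snp_r2 = linear_fit(months, snps)
s_slope, _, s_r2 = linear_fit(months, s_over_kb)
print(f"SNPs:      slope = {snp_slope:.3f} per month, R^2 = {snp_r2:.3f}")
print(f"S_info/kb: slope = {s_slope:.3f} per month, R^2 = {s_r2:.3f}")
```

A positive slope for the SNP count together with a negative slope for S_info/k_b is the behavior summarized in Fig. 3.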


V. CONCLUSIONS 


In this study, we introduced the second law of infodynamics, 
which is universally applicable to any information system, includ- 
ing biological systems where the number of information states 
remains constant. We demonstrated that the information bear- 
ing states evolve over time in a way that their associated entropy 
remains constant or decreases. Hence, all physical systems con- 
taining information states should obey not only the second law 
of thermodynamics but also the second law of infodynamics, as 
demonstrated in this article. The introduction of the second law of 
infodynamics is of fundamental importance because it will aid future 
studies and developments in a diverse range of sciences, including 
genetics, evolutionary biology, virology, computing, big data,
physics, and cosmology. However, in this article, we do not address 
the implications of the second law of infodynamics to fundamental 
issues such as the evolution of information in the universe, the over- 
all balance of physical and information entropies in the universe, 
or the growth of biological information in the terrestrial biosphere 
and beyond. We also do not explain how the second law of info- 
dynamics relates to the relaxation times of the information states 
and the observation time, nor do we address the question of the 
possible existence of fluctuations of information states when the 
minimal information entropy state occurs. We, therefore, hope that 
these unanswered questions will be addressed in future studies
stimulated by this work. 